          PATTERN-MATCHING RULES FOR THE RECOGNITION OF
              NATURAL LANGUAGE DIALOGUE EXPRESSIONS

                     Kenneth Mark Colby
                             and
                     Roger C. Parkison


INTRODUCTION

     To recognize something is to identify it as an instance of the "same again". This familiarity is possible because of recurrent characteristics of the world. We shall describe an algorithm which recognizes recurrent characteristics of natural language dialogue expressions. It utilizes a multi-stage sequence of pattern-matching rules which progressively transform the input expression into a pattern that eventually best matches a more abstract stored pattern. The name of the stored pattern has a pointer to the name of a response function which decides what to do once the input has been characterized. Here we shall discuss only the recognizing functions, except for one response function (anaphoric substitution) which interactively aids the characterization process. How the response functions operate will be described in a future communication.
     In constructing and testing a simulation of paranoid processes, we were faced with the problem of reproducing paranoid linguistic behavior in a diagnostic psychiatric interview. The diagnosis of paranoid states, reactions, or modes is made by clinicians who judge the degree of correspondence between what they observe linguistically in an interview and their conceptual model of paranoid behavior. There exists a high degree of agreement among psychiatrists about this conceptual model, which relies mainly on what an interviewee says and how he says it.
     Natural language is a life-expressing code people use for communication with themselves and others. In a real-life dialogue such as a psychiatric interview, the participants have interests, intentions, and expectations which are revealed in their linguistic expressions. To produce effects on an interviewer which he would judge similar to the effects produced by a paranoid patient, an interactive simulation of a paranoid patient must be able to demonstrate typical paranoid interview behavior. To achieve the desired effects, the paranoid model must have the ability to deal with the linguistic behavior of the interviewer.
     There are a number of approaches one might consider for an ideal handling of dialogue expressions. One approach would be to construct a dictionary of all expressions which could possibly arise in an interview. Associated with each expression would be its interpretation, depending on dialogue context, which in turn must somehow be defined. For obvious reasons, no one takes this approach seriously. Instead of an expression dictionary, one might construct a word dictionary and then use projection rules to yield an interpretation of a sentence from the dictionary definitions. This has been the approach adopted by most linguistic parsers based on transformational or systemic grammars. Such a method performs adequately as long as the dictionary involves only a few hundred words, each word having only one or two senses, and the dialogue is limited to a mini-world of a few objects and relations. But the problems which arise in a real-time psychiatric interview conducted in unrestricted English are too great for this method to be useful in actual dialogues requiring immediate responses.
     There is little consensus knowledge about how humans process natural language. They can be shown to possess some knowledge of grammar rules, but this does not entail that they use a grammar in interpreting and producing language. Irregular verb-tenses and noun-plurals do not follow grammatical rules, yet people use thousands of them. One school of linguistics believes that people possess full transformational grammars for processing language. In our view this position is implausible. Language is what is recognized, but the processes involved may not be linguistic or grammatical. Transformational grammars were not originally designed to "understand" a large subset of English; they represented a set of axioms for deciding whether a string is grammatical. Efforts to use them for other purposes have not been very fruitful.
     An analysis of what one's problem actually is should guide the selection or invention of methods appropriate to that problem's solution. Our problem was not to develop a consistent and general theory of language, nor to assert empirically testable hypotheses about how people process language. Our problem was to design an algorithm which recognizes what is being said in a dialogue and what is being said about it, in order to make a response such that a sample of I-O pairs from the paranoid model is judged similar to a sample of I-O pairs from paranoid patients. From the perspective of artificial intelligence as an attempt to get computers to perform human intellectual tasks, new methods had to be devised for the task of participating in a human dialogue in a paranoid-patient-like way. We make no claim that our methods represent the way people process language; we sought effective methods which could operate efficiently in real time. Nor would we deny that our methods for this skillful task are possibly analogous to the methods humans use. And since our methods provide a general way of many-to-one mapping from surface expressions to a single stored pattern, they can be used by any type of "host" system which takes natural language as input.
     To perform the task of managing the communicative uses and effects of natural language, we developed a strategy of transforming the input until a pattern is achieved which matches, completely or partially, a more abstract stored pattern. This strategy has proved adequate for our purposes a satisfactory percentage of the time. (No one expects any method to be successful 100% of the time, since not even humans, the best natural language processors around, achieve this level of performance.) The power of this method for natural language dialogues lies in its ability to ignore unrecognizable expressions and irrelevant details. A conventional parser doing word-by-word, parts-of-speech analysis fails when it cannot find one or more of the input words in its dictionary. Because it must know every word, it is too fragile for unrestricted dialogues.
     In early versions of the paranoid model (PARRY1), many (but not all) of the pattern-recognition mechanisms allowed the elements of the pattern to be order-independent (Colby, Weber, and Hilf, 1971). For example, consider the following expressions:
        (1) WHERE DO YOU WORK?
        (2) WHAT SORT OF WORK DO YOU DO?
        (3) WHAT IS YOUR OCCUPATION?
        (4) WHAT DO YOU DO FOR A LIVING?
        (5) WHERE ARE YOU EMPLOYED?
In PARRY1 a procedure would scan these expressions looking for an information-bearing contentive such as "work", "for a living", etc. If it found such a contentive along with a "you" or "your" in the expression, regardless of word order, it would respond to the expression as if it were a question about the nature of one's work. (There is some doubt this even qualifies as a pattern, since interrelations between words are ignored and only their presence is considered.) An insensitivity to word order has the advantage that lexical items representing different parts of speech can represent the same concept, e.g. "work" as noun or as verb. But a price is paid for this resilience and elasticity. We found from experience that, since English relies heavily on word order to convey the meaning of its messages, the average penalty of misunderstanding (to be distinguished from ununderstanding) was too great. Hence in PARRY2, as will be described shortly, the patterns require a specified word order.
     It seems agreed that for high-complexity problems it is useful to have constraints. Diagnostic psychiatric interviews (and especially those conducted over teletypes) have several natural constraints. First, clinicians are trained to ask certain questions in certain ways. These stereotypes can be treated as special cases. Second, only a few hundred standard topics are brought up by interviewers, who are trained to use everyday expressions and especially those used by the patient himself. When the interview is conducted over teletypes, expressions tend to be shortened, since the interviewer tries to increase the information transmission rate over the slow channel of a teletype. (It is said that short expressions are more grammatical, but think about the phrase "Now now, there there.") Finally, teletyped interviews represent written speech. Speech is known to be highly redundant; hence many unrecognized words can be ignored without losing the meaning of the message. Written speech is loaded with idioms, cliches, pat phrases, etc., all easy prey for a pattern-matching approach. It is time-wasting and usually futile to try to decode an idiom by analyzing the meanings of its individual words.
     We shall now describe the pattern-matching functions of the algorithm in some detail.

PREPROCESSING

     Each word in the input expression is first looked up in a dictionary of (currently) 1900 entries which, for the sake of speed, is maintained in core during run-time. The dictionary, which was built empirically from thousands of teletyped interviews with previous versions of the model, consists of words, groups of words, and the names of the word-classes they can be translated into. (See Fig. 1 for illustrative examples of dictionary entries.) The size of the dictionary is determined by the patterns, i.e. each dictionary word appears in one or more patterns. Entries in the dictionary reflect PARRY2's main interests. If a word in the input is not in the dictionary, it is dropped from the pattern being formed. Thus if the input were:
        WHAT IS YOUR CURRENT OCCUPATION?
and the word "current" were not in the dictionary, the pattern at this stage becomes:
        ( WHAT IS YOUR OCCUPATION )
The question mark is thrown away as redundant, since questions are recognized by word order. (A statement followed by a question mark (YOU GAMBLE?) is considered to be communicatively equivalent in its effects to that statement followed by a period.) Synonymic translations of words are made so that the pattern becomes, for example:
        ( WHAT BE YOU JOB )
Groups of words are translated into a single word-class name, so that, for example, "for a living" becomes "job".
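     For exposition, the preprocessing stage can be pictured in present-day Python. The sketch below is ours, not the model's code, and its dictionary entries and function names are invented for illustration.

    # Sketch of preprocessing: translate known words and word groups into
    # word-class names; drop words not found in the dictionary.
    DICTIONARY = {                      # invented sample entries
        "what": "WHAT", "is": "BE", "are": "BE",
        "you": "YOU", "your": "YOU",
        "work": "JOB", "occupation": "JOB", "for a living": "JOB",
    }

    def preprocess(expression):
        # Trailing punctuation is thrown away; questions are recognized
        # by word order, not by the question mark.
        words = expression.lower().rstrip("?.!").split()
        pattern, i = [], 0
        while i < len(words):
            for span in (3, 2, 1):      # longest word-group match first
                group = " ".join(words[i:i + span])
                if group in DICTIONARY:
                    pattern.append(DICTIONARY[group])
                    i += span
                    break
            else:
                i += 1                  # unknown word: drop it
        return pattern

    print(preprocess("WHAT IS YOUR CURRENT OCCUPATION?"))
    # -> ['WHAT', 'BE', 'YOU', 'JOB']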
     Misspellings are handled in two ways. First, common misspellings of important words are simply put in the dictionary. Thus "yoy" is known to mean "you". The apostrophe is often omitted from contractions, so most contractions are recognized with or without it. These common misspellings were gathered from over 4000 interviews with earlier versions of the paranoid model.
     Second, there are five common forms of typing error which are checked systematically. These are:
        1) Doubled letter
        2) Extraneous letter
        3) Forgetting to hold the "shift key" for an apostrophe
        4) Hitting a nearby key on the keyboard
        5) Transposing two letters in a word
The first three errors can be corrected by deleting the offending character from the word. The fourth type of error is checked for only 10 of the more common near misses. These were also empirically determined and include letter pairs like (T Y), (O P), (A S), and (N M). These methods are all based on typing errors, but they also correct some legitimate English spelling errors. Two-letter transposition corrects, for example, "beleive" to "believe". A sketch of these checks appears below.
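     The sketch, again in present-day Python with invented names; in_dictionary stands for a lookup in the dictionary described above, and the near-miss table here holds only sample pairs:

    # Generate candidate corrections for the five typing-error forms.
    NEAR_MISSES = {"t": "y", "y": "t", "o": "p", "p": "o",
                   "a": "s", "s": "a", "n": "m", "m": "n"}  # sample pairs

    def candidate_corrections(word):
        # 1-3) Doubled letter, extraneous letter, and the stray character
        #      left by missing the shift for an apostrophe are all
        #      corrected by deleting one character.
        for i in range(len(word)):
            yield word[:i] + word[i + 1:]
        # 4) Hitting a nearby key: substitute a common near-miss letter.
        for i, c in enumerate(word):
            if c in NEAR_MISSES:
                yield word[:i] + NEAR_MISSES[c] + word[i + 1:]
        # 5) Transposing two adjacent letters.
        for i in range(len(word) - 1):
            yield word[:i] + word[i + 1] + word[i] + word[i + 2:]

    def correct(word, in_dictionary):
        for candidate in candidate_corrections(word):
            if in_dictionary(candidate):
                return candidate
        return None

    print(correct("beleive", lambda w: w == "believe"))  # -> believe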
     Certain juxtaposed words are contracted into a single word, e.g. "GET ALONG WITH" becomes "GETALONGWITH". This is done to deal with groups of words which are represented as a single element in the stored pattern, thereby preventing segmentation from occurring at the wrong places, such as at a preposition inside an idiom. Besides these contractions, certain expansions are made so that, for example, "DON'T" becomes "DO NOT" and "I'D" becomes "I WOULD".

SEGMENTING

     Another weakness in the crude pattern-matching of PARRY1 was that it took the entire input expression as its basic processing unit. Hence if only two words were recognized in an eight-word input, the risk of misunderstanding was great. We needed a way of dealing with units shorter than the entire input expression.
     Expert telegraphists stay six to twelve words behind a received message before transcribing it. Translators wait until they have heard four to six words before they begin translating. Aided by a heuristic from work in machine translation (Wilks, 1973), we devised a way of bracketing the pattern constructed up to this point into shorter segments, using prepositions, wh-forms, certain verbs, etc. as bracketing points. (A list of the bracketing terms appears in Fig. 2.) The new pattern formed is termed either "simple", having no delimiters within it, or "compound", i.e., being made up of two or more simple patterns. A simple pattern might be:
        ( WHAT BE YOU JOB )
whereas a compound pattern would be:
        (( WHY BE YOU ) ( IN HOSPITAL ))
Our experience with this method of segmentation shows that compound patterns from teletyped psychiatric dialogues rarely consist of more than three or four segments.
     After certain verbs (see Fig. 3) a bracketing occurs to replace the commonly omitted "THAT", such that:
        ( I THINK YOU BE AFRAID )
becomes
        (( I THINK ) ( YOU BE AFRAID ))
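     The segmenter itself is simple, and can be sketched as follows (present-day Python; the delimiter and verb lists are invented samples standing in for the terms of Figs. 2 and 3):

    DELIMITERS = {"WHAT", "WHY", "HOW", "IN", "FOR", "TO"}  # sample terms
    THAT_VERBS = {"THINK", "BELIEVE", "KNOW"}               # sample verbs

    def segment(pattern):
        # Open a new segment at each bracketing term, or after a verb
        # such as THINK to restore the commonly omitted "THAT" boundary.
        segments, current = [], []
        for word in pattern:
            if current and (word in DELIMITERS or current[-1] in THAT_VERBS):
                segments.append(current)
                current = []
            current.append(word)
        if current:
            segments.append(current)
        return segments

    print(segment(["WHY", "BE", "YOU", "IN", "HOSPITAL"]))
    # -> [['WHY', 'BE', 'YOU'], ['IN', 'HOSPITAL']]
    print(segment(["I", "THINK", "YOU", "BE", "AFRAID"]))
    # -> [['I', 'THINK'], ['YOU', 'BE', 'AFRAID']]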

PREPARATION FOR MATCHING

     Conjunctions serve only as markers for the segmenter, and they are dropped out after segmentation.
     Negations are handled by extracting the "NOT" from the pattern and assigning a value to a global variable which indicates to the algorithm that the expression is negative in form. When a pattern is finally matched, this variable is consulted. Some patterns have a pointer to a pattern of opposite meaning if a "NOT" could reverse their meanings. If this pointer is present and a "NOT" is found, then the pattern matched is replaced by its opposite, e.g. ( I NOT TRUST YOU ) is replaced by the pattern ( I MISTRUST YOU ). We have not yet observed the troublesome case of "he gave me not one but two messages". (There is no need to scratch where it doesn't itch.)
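     In sketch form (present-day Python; the opposite-meaning table is an invented sample of the pointers just described):

    OPPOSITES = {("I", "TRUST", "YOU"): ("I", "MISTRUST", "YOU")}  # sample

    def handle_negation(pattern):
        # Extract "NOT" and set a flag; if the resulting pattern carries
        # a pointer to an opposite-meaning pattern, follow it. In the
        # model the flag is a global consulted when a match is found.
        negated = "NOT" in pattern
        pattern = [w for w in pattern if w != "NOT"]
        if negated and tuple(pattern) in OPPOSITES:
            return list(OPPOSITES[tuple(pattern)]), False
        return pattern, negated

    print(handle_negation(["I", "NOT", "TRUST", "YOU"]))
    # -> (['I', 'MISTRUST', 'YOU'], False)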

MATCHING AND RECYCLING

     The algorithm now attempts to match the segmented patterns with stored patterns, which currently number about 2200, 1700 being simple and 500 being compound. First a complete and perfect match is sought. When a match is found, the stored pattern name has a pointer to the name of a response function which decides what to do further. If a match is not found, further transformations of the pattern are carried out and a "fuzzy" match is tried.
     For fuzzy matching at this stage, we adopted the heuristic rule of dropping elements in the pattern one at a time and attempting a match each time. This heuristic allows familiar words in unfamiliar contexts to be ignored. For example, "well" is important in "Are you well?" but meaningless in "Well are you?".
     Deleting one element at a time results in, for example, the pattern:
        ( WHAT BE YOU MAIN PROBLEM )
becoming successively:
        (a) ( BE YOU MAIN PROBLEM )
        (b) ( WHAT YOU MAIN PROBLEM )
        (c) ( WHAT BE MAIN PROBLEM )
        (d) ( WHAT BE YOU PROBLEM )
        (e) ( WHAT BE YOU MAIN )
Since the stored pattern in this case matches (d), (e) would not be constructed. We found it unwise to delete more than one element, since our segmentation method usually yields segments containing a small number (1-4) of words.
     Dropping one element at a time provides a probability threshold for fuzzy matching which is a function of the length of the segment. If a segment consists of five elements, four of the five must be present in a particular order (with the fifth element missing in any position) for a match to occur. If a segment contains four elements, three must match, and so forth.
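     A sketch of the fuzzy match (present-day Python; STORED stands in for the stored simple patterns, and the pattern names are invented):

    STORED = {("WHAT", "BE", "YOU", "PROBLEM"): "WHAT-BE-YOU-PROBLEM"}

    def match(pattern):
        key = tuple(pattern)
        if key in STORED:                    # perfect match first
            return STORED[key]
        for i in range(len(key)):            # then drop one element at a time
            reduced = key[:i] + key[i + 1:]
            if reduced in STORED:
                return STORED[reduced]
        return None                          # default condition

    print(match(["WHAT", "BE", "YOU", "MAIN", "PROBLEM"]))
    # -> WHAT-BE-YOU-PROBLEM  (found at deletion (d); (e) is never built)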
     The transformations described above result in a progressive coarsening of the patterns by deletion. Substitutions are also made in certain cases. Some patterns contain pronouns which could stand for a number of different things of importance to PARRY2. As we mentioned in the introduction, there is one case in which the response functions of memory aid in language recognition. One function of memory is to keep track of the context in order to give pronouns and other anaphoras a correct interpretation. For example, the pattern:
        ( DO YOU AVOID THEM )
could refer to the Mafia, or racetracks, or other patients, depending on the context. When such a pattern is recognized, the pronoun is replaced by its current anaphoric value as determined by the response functions, and a more specific pattern such as:
        ( DO YOU AVOID MAFIA )
is looked up. In many cases the meaning of a pattern containing a pronoun is clear without any substitution. In the pattern:
        (( HOW DO THEY TREAT YOU ) ( IN HOSPITAL ))
the meaning of THEY is clarified by ( IN HOSPITAL ).
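     The substitution can be sketched as follows (present-day Python; in the model the current topic is maintained by the response functions of memory, and the names here are invented):

    PRONOUNS = {"THEY", "THEM", "IT"}

    def substitute_anaphora(pattern, current_topic):
        # Replace a pronoun by its current anaphoric value, yielding a
        # more specific pattern to be looked up as before.
        return [current_topic if w in PRONOUNS else w for w in pattern]

    print(substitute_anaphora(["DO", "YOU", "AVOID", "THEM"], "MAFIA"))
    # -> ['DO', 'YOU', 'AVOID', 'MAFIA']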

COMPOUND-PATTERN MATCH

     When more than one simple pattern is detected in the input, a second matching is attempted. The deletion methods used are similar to those of the first matching, except that they occur at the segment level rather than at the single-element level. Certain patterns, such as ( HELLO ) and ( I THINK ), are dropped because they are considered meaningless. If a complete match is not found, then simple patterns are dropped, one at a time, from the compound pattern. This allows the input:
        (( HOW DO YOU COME ) ( TO BE ) ( IN HOSPITAL ))
to match the stored pattern:
        (( HOW DO YOU COME ) ( IN HOSPITAL ))
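     The segment-level deletion can be sketched the same way as the element-level one (present-day Python; the stored table and names are invented):

    MEANINGLESS = {("HELLO",), ("I", "THINK")}   # dropped before matching
    STORED_COMPOUND = {
        (("HOW", "DO", "YOU", "COME"), ("IN", "HOSPITAL")):
            "HOW-COME-IN-HOSPITAL",
    }

    def match_compound(segments):
        segments = tuple(tuple(s) for s in segments
                         if tuple(s) not in MEANINGLESS)
        if segments in STORED_COMPOUND:          # complete match first
            return STORED_COMPOUND[segments]
        for i in range(len(segments)):           # then drop whole segments
            reduced = segments[:i] + segments[i + 1:]
            if reduced in STORED_COMPOUND:
                return STORED_COMPOUND[reduced]
        return None

    print(match_compound([["HOW", "DO", "YOU", "COME"],
                          ["TO", "BE"], ["IN", "HOSPITAL"]]))
    # -> HOW-COME-IN-HOSPITAL  (after dropping ( TO BE ))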

     If no match can be found at this point, the algorithm has arrived at a default condition and the appropriate response functions decide what to do. For example, in a default condition, the model may assume control of the interview, asking the interviewer a question, continuing with the topic under discussion, or introducing a new topic.
     An example of interview dialogue is presented in the appendix, showing the transformations the input expressions undergo and the patterns they match.

ADVANTAGES AND LIMITATIONS

     As mentioned, one of the main advantages of a pattern-matching strategy is that it can ignore as irrelevant what it does NOT recognize. There are several million words in English, each possessing from one to over a hundred senses. To construct a machine-usable word dictionary of this magnitude is out of the question at this time. Recognition of natural language input such as that described above allows real-time interaction in a dialogue, since it avoids becoming ensnarled in combinatorial disambiguations and long chains of inferencing which would slow a dialogue algorithm down to impracticality, if it could even function at all. The price paid for pattern-matching is that sometimes, but rarely, ambiguities slip through. Eventually a rewrite interpreter must be added to the pattern-matching rules to deal with ambiguities (Enea and Colby, 1973).
     A drawback to PARRY1 was that it reacted to the first pattern it found in the input, rather than characterizing the input as fully as possible and then deciding what to do based on a number of tests. Another practical difficulty with PARRY1, from a programmer's viewpoint, was that elements of the patterns were strung out in various procedures throughout the algorithm. It was often a considerable chore for the programmer to determine whether a given pattern was present and precisely where it was. In the above-described method the patterns are all collected in one part of the data-base, where they can easily be examined.
     Concentrating all the patterns in the data-base gives PARRY2 a limited "learning" ability. When an input fails to match any stored pattern, or matches an incorrect one as judged by a human operator, a pattern which matches the input can be put into the data-base automatically. If the new pattern has the same meaning as a previously stored pattern, the human operator must provide the name of the appropriate response function. If he doesn't remember the name, he may try to rephrase the input in a form recognizable to PARRY2, and it will name the response function associated with the rephrasing. These mechanisms are not "learning" in the commonly used sense, but they do allow a person to transfer his knowledge into PARRY2's data-base with very little redundant effort.
     Informal observation thus far shows PARRY2's linguistic recognition abilities to be quite superior to PARRY1's. A more systematic and quantitative evaluation of performance will be carried out in the future.


REFERENCES

Colby, K.M., Weber, S., and Hilf, F.D. (1971). Artificial paranoia. ARTIFICIAL INTELLIGENCE, 2, 1-25.

Enea, H. and Colby, K.M. (1973). Idiolectic language-analysis for understanding doctor-patient dialogues. THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 278-284.

Wilks, Y. (1973). An artificial intelligence approach to machine translation. In COMPUTER MODELS OF THOUGHT AND LANGUAGE, Schank, R. and Colby, K.M., Eds., W.H. Freeman, San Francisco.